On Parsing CHILDES

Author

  • Aarre Laakso
Abstract

Research on child language acquisition would benefit from the availability of a large body of syntactically parsed utterances between parents and children. We consider the problem of generating such a “treebank” from the CHILDES corpus, which currently contains primarily orthographically transcribed speech tagged for lexical category.

Similar resources

I will shoot your shopping down and you can shoot all my tins: Automatic Lexical Acquisition from the CHILDES Database

Empirical data regarding the syntactic complexity of children’s speech is important for theories of language acquisition. Currently much of this data is absent in the annotated versions of the CHILDES database. In this preliminary study, we show that a state-of-the-art subcategorization acquisition system of Preiss et al. (2007) can be used to extract large-scale subcategorization (frequency) inf...

Parsing Hebrew CHILDES transcripts

We present a syntactic parser of (transcripts of) spoken Hebrew: a dependency parser of the Hebrew CHILDES database. CHILDES is a corpus of child–adult linguistic interactions. Its Hebrew section has recently been morphologically analyzed and disambiguated, paving the way for syntactic annotation. This paper describes a novel annotation scheme of dependency relations reflecting constructions of...

High-accuracy Annotation and Parsing of CHILDES Transcripts

Corpora of child language are essential for psycholinguistic research. Linguistic annotation of the corpora provides researchers with better means for exploring the development of grammatical constructions and their usage. We describe an ongoing project that aims to annotate the English section of the CHILDES database with grammatical relations in the form of labeled dependency structures. To d...

Parsing the CHILDES Database: Methodology and Lessons Learned

This paper discusses the process of parsing adult utterances directed to a child, in an effort to produce a syntactically annotated corpus of the verbal input to a human language learner. In parsing the Eve corpus of the CHILDES database, we encountered several challenges relating to parser coverage and ambiguity, for which we describe solutions that result in a system capable of analyzing almo...

Incremental Grammar Induction from Child-Directed Dialogue Utterances

We describe a method for learning an incremental semantic grammar from data in which utterances are paired with logical forms representing their meaning. Working in an inherently incremental framework, Dynamic Syntax, we show how words can be associated with probabilistic procedures for the incremental projection of meaning, providing a grammar which can be used directly in incremental probabil...

Publication date: 2005